Predicting eukaryotic transcriptional cooperativity by Bayesian network integration of genome-wide data

Authors

  • Yong Wang
  • Xiang-Sun Zhang
  • Yu Xia
Abstract

Transcriptional cooperativity among several transcription factors (TFs) is believed to be the main mechanism of complexity and precision in transcriptional regulatory programs. Here, we present a Bayesian network framework to reconstruct a high-confidence whole-genome map of transcriptional cooperativity in Saccharomyces cerevisiae by integrating a comprehensive list of 15 genomic features. We design a Bayesian network structure to capture the dominant correlations among features and TF cooperativity, and introduce a supervised learning framework with a well-constructed gold-standard dataset. This framework allows us to assess the predictive power of each genomic feature, validate the superior performance of our Bayesian network compared to alternative methods, and integrate genomic features for optimal TF cooperativity prediction. Data integration reveals 159 high-confidence predicted cooperative relationships among 105 TFs, most of which are subsequently validated by literature search. The existing and predicted transcriptional cooperativities can be grouped into three categories based on the combination patterns of the genomic features, providing further biological insights into the different types of TF cooperativity. Our methodology is the first supervised learning approach for predicting transcriptional cooperativity, compares favorably to alternative unsupervised methodologies, and can be applied to other genomic data integration tasks where high-quality gold-standard positive data are scarce.
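
As a rough illustration of how supervised integration of genomic features can work, the sketch below scores TF pairs with a naive Bayes model, which is the simplest possible Bayesian network because it treats all features as conditionally independent given the class. This is an assumption made for brevity: the abstract describes a network structure designed to capture correlations among features, which this sketch does not reproduce. The function names, the binary feature encoding, and the toy data are all hypothetical.

```python
import math


def fit_likelihood_ratios(samples, labels, n_features, pseudocount=1.0):
    """Estimate per-feature log likelihood ratios from labeled TF pairs.

    samples: list of tuples of 0/1 feature values (hypothetical encoding)
    labels:  list of 0/1 flags, 1 = gold-standard cooperative pair
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    log_lr = []
    for j in range(n_features):
        table = {}
        for value in (0, 1):
            pos = sum(1 for x, y in zip(samples, labels) if y == 1 and x[j] == value)
            neg = sum(1 for x, y in zip(samples, labels) if y == 0 and x[j] == value)
            # Laplace smoothing keeps the ratio finite when a count is zero.
            p_pos = (pos + pseudocount) / (n_pos + 2 * pseudocount)
            p_neg = (neg + pseudocount) / (n_neg + 2 * pseudocount)
            table[value] = math.log(p_pos / p_neg)
        log_lr.append(table)
    return log_lr


def score_pair(features, log_lr, prior_log_odds=0.0):
    """Posterior log odds of cooperativity under the independence assumption."""
    return prior_log_odds + sum(log_lr[j][v] for j, v in enumerate(features))


# Toy data (made up): feature 0 tracks the label, feature 1 is uninformative.
toy_samples = [(1, 1), (1, 0), (0, 0), (0, 1)]
toy_labels = [1, 1, 0, 0]
toy_model = fit_likelihood_ratios(toy_samples, toy_labels, n_features=2)
print(score_pair((1, 1), toy_model))  # positive log odds
print(score_pair((0, 1), toy_model))  # negative log odds
```

The per-feature log likelihood ratio also gives a simple way to compare the predictive power of individual features: a value near zero means the feature barely separates gold-standard positives from negatives.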

Similar resources

CCAT: Combinatorial Code Analysis Tool for transcriptional regulation

Combinatorial interplay among transcription factors (TFs) is an important mechanism by which transcriptional regulatory specificity is achieved. However, despite the increasing number of TFs for which either binding specificities or genome-wide occupancy data are known, knowledge about cooperativity between TFs remains limited. To address this, we developed a computational framework for predict...

Predicting gene regulation by sigma factors in Bacillus subtilis from genome-wide data

MOTIVATION: Sigma factors regulate the expression of genes in Bacillus subtilis at the transcriptional level. We assess the accuracy of a fold-change analysis, Bayesian networks, dynamic models and supervised learning based on coregulation in predicting gene regulation by sigma factors from gene expression data. To improve the prediction accuracy, we combine sequence information with expression ...

I-10: Transcriptomics in Oocyte Mediated Cellular Reprogramming

Background: Early embryonic development in mammals begins in transcriptional silence, with an oocyte-mediated transcriptional reprogramming of parental gametes that occurs during a so-called across-the-board process of “erase-and-rebuild”. In this process, the parental transcription programs are erased long before (maternal) or soon thereafter (paternal) fertilization to generate a...

Integrated analysis of regulatory and metabolic networks reveals novel regulatory mechanisms in Saccharomyces cerevisiae.

We describe the use of model-driven analysis of multiple data types relevant to transcriptional regulation of metabolism to discover novel regulatory mechanisms in Saccharomyces cerevisiae. We have reconstructed the nutrient-controlled transcriptional regulatory network controlling metabolism in S. cerevisiae consisting of 55 transcription factors regulating 750 metabolic genes, based on inform...

The Impact of Different Genetic Architectures on Accuracy of Genomic Selection Using Three Bayesian Methods

Genome-wide evaluation uses the associations of a large number of single nucleotide polymorphism (SNP) markers across the whole genome and then combines statistical methods with genomic data to predict genetic values. Genomic prediction relies on linkage disequilibrium (LD) between genetic markers and quantitative trait loci (QTL) in a population. Methods that use all markers simultaneo...

Journal title:

Volume 37  Issue 

Pages  -

Publication date 2009